Current status of the sequence of the rice genome and prospects for finishing the first monocot genome.

Author

  • C Robin Buell
Abstract

Rice (Oryza sativa) is the first grass species to be sequenced, and as of September 2002, there are four draft genome sequences available. All four drafts are available to the academic community, although two drafts have some limitations with respect to access and distribution. Although none of the four draft sequences is complete, they collectively provide our first view of the landscape and the content of a monocot genome. The first rice genome sequence made accessible in large tracts was that of the O. sativa subsp japonica cv Nipponbare generated by the International Rice Genome Sequencing Project (IRGSP; Sasaki and Burr, 2000), an international consortium of public laboratories. Using a bacterial artificial chromosome (BAC)-by-BAC approach, the IRGSP has generated draft sequence of 3,083 BAC or P1 artificial chromosome (PAC) clones that is available through GenBank/DNA data bank of Japan (DDBJ)/EMBL (as of September 17, 2002). These 3,083 BAC/PAC clones represent 426 Mb of sequence, and assuming an overlap of 15% between the clones, this would represent 362 Mb of unique sequence. With an estimated genome size of 430 Mb (Arumuganathan and Earle, 1991), this represents 84% of the rice genome. Alignment of the IRGSP sequence with 13,895 sequenced genetic markers reveals that 11,442 markers can be anchored to a BAC/PAC clone using high-stringency criteria (http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml), indicating that based on coverage of markers, the IRGSP sequence represents 82% of the genome. A graphic depiction of the anchoring of the BAC/PAC clones to the chromosomes can be viewed at http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml. There is clearly representation throughout most of the chromosomes, with the exceptions occurring in the regions devoid of, or lacking in, a high density of genetic markers in which to anchor the BAC/PAC clones.
Likewise, regions where it is technically difficult to identify BAC/PAC clones (telomeres, centromeres, and nucleolar-organizing regions) are under-represented in the IRGSP sequence. Although the majority of the IRGSP sequence is draft sequence, approximately a third of the sequence is finished (1,023 BAC/PAC clones as of September 12, 2002; http://www.tigr.org/tdb/e2k1/osa1/BACmapping/description.shtml). In fact, manuscripts describing the sequence, annotation, and analysis of chromosomes 1 and 4 are in press (T. Sasaki and B. Hin, personal communication) and a manuscript on chromosome 10 is in preparation (C.R. Buell, W. McCombie, J. Messing, and R.A. Wing, personal communication), highlighting the role of the IRGSP in finishing the rice genome. In addition, the overall quality of draft sequence generated by the IRGSP is high, with the bulk of the sequence being 10×, phase 2 sequence, with 10× being the level of sequence coverage and phase 2 reflecting the fact that the contigs are ordered and oriented when deposited in GenBank (http://www.ncbi.nlm.nih.gov/HTGS/). Although the immediate goal of the IRGSP is completion of a phase 2 draft of the rice genome by the end of 2002 (http://rgp.dna.affrc.go.jp/rgp/press_conference.html), the ultimate goal is that of a finished rice genome. Annotation for the IRGSP BAC/PAC clones is available for finished clones in GenBank/DDBJ/EMBL. Annotation data for unfinished sequences are generated through automated annotation processes and are available from The Institute for Genomic Research (http://www.tigr.org/tigr-scripts/e2k1/irgsp.spl) and the Rice Genome Program (http://rgp.dna.affrc.go.jp/giot/INE.html). Although manually curated annotation is always preferred over automated annotation, access to automated annotation for unfinished sequences provides a valuable resource for these unfinished sequences.
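The coverage figures quoted above follow from simple arithmetic. As a minimal sketch, the Python below recomputes them from the numbers given in the text. The constant and function names are illustrative and do not come from the article, and the 15% clone overlap is the assumption stated in the text rather than a measured value.

```python
# Figures quoted in the abstract (status as of September 2002).
TOTAL_CLONE_SEQUENCE_MB = 426   # sequence in the 3,083 BAC/PAC clones
ASSUMED_OVERLAP = 0.15          # overlap between clones assumed in the text
GENOME_SIZE_MB = 430            # estimate of Arumuganathan and Earle (1991)
MARKERS_TOTAL = 13_895          # sequenced genetic markers aligned
MARKERS_ANCHORED = 11_442       # markers anchored to a BAC/PAC clone
CLONES_TOTAL = 3_083            # BAC/PAC clones with draft sequence
CLONES_FINISHED = 1_023         # clones with finished sequence


def unique_sequence_mb(total_mb: float, overlap: float) -> float:
    """Discount the total clone sequence by the assumed overlap fraction."""
    return total_mb * (1 - overlap)


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100 * part / whole


unique_mb = unique_sequence_mb(TOTAL_CLONE_SEQUENCE_MB, ASSUMED_OVERLAP)
genome_coverage_pct = percent(unique_mb, GENOME_SIZE_MB)
marker_coverage_pct = percent(MARKERS_ANCHORED, MARKERS_TOTAL)
finished_clone_pct = percent(CLONES_FINISHED, CLONES_TOTAL)

print(f"Unique sequence: {unique_mb:.1f} Mb")
print(f"Coverage by sequence length: {genome_coverage_pct:.1f}%")
print(f"Coverage by anchored markers: {marker_coverage_pct:.1f}%")
print(f"Finished clones: {finished_clone_pct:.1f}%")
```

Rounded to whole numbers, the results match the 362 Mb, 84%, and 82% given in the text, and the share of finished clones comes out at roughly a third. The last figure counts clones rather than bases, so it only approximates the fraction of finished sequence.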
Other analyses of the rice genome, such as alignment with expressed sequence tags from other monocot species, identification of motifs/domains within the rice proteome, analysis of repetitive sequences, and identification of syntenic sequences are available through several public sources (http://www.tigr.org/tdb/e2k1/osa1/; http://rgp.dna.affrc.go.jp/; http://www.gramene.org). Draft sequence of the same rice cv Nipponbare japonica sequenced by the IRGSP is available from two separate private sources, Pharmacia (Peapack, NJ) and Syngenta (San Diego). The Pharmacia draft sequence was generated using a BAC-by-BAC approach and represents 259 Mb of sequence (Barry, 2001). Access to this draft sequence is available to academic scientists

1 The work on rice genome sequencing at TIGR was supported by the U.S. Department of Agriculture (grant no. 99–35317–8275), by the National Science Foundation (grant no. DBI998282), and by the U.S. Department of Energy (grant no. DE–FG02–99ER2035).
* E-mail [email protected]; fax 301–838–0208.
www.plantphysiol.org/cgi/doi/10.1104/pp.014878.

Similar resources

Unveiling the genetic loci for a panicle developmental trait using genome-wide association study in rice

Panicle size has a high correlation with grain yield in rice. There is a bottleneck to identify the additional quantitative trait loci (QTL) for panicle size due to the conventional traits used for QTL mapping. To identify more genetic loci for panicle size, a panicle developmental trait (LNTB, the length from panicle neck-knot to the first primary branch in the rachis) related to panicle size ...

Advancing Chimeric Antigen Receptor-Engineered T-Cell Immunotherapy Using Genome Editing Technologies: Challenges and Future Prospects

Chimeric antigen receptor-engineered T (CAR-T) cells, also called living drugs, have recently been recognized as a breakthrough technology and have been applied as an adoptive immunotherapy against different types of cancer. They have also attracted widespread interest because of the success of B-cell malignancy therapy achieved by anti-CD19 CAR-T cells. The current genetic toolbox has enabled the synthesis of CARs ...

Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species

Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to the D genome, which is an important wild bio-source for cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information is a versatile tool for species identification and phylogenetic implications in plants. Different ch...

Fingerprinting and genetic diversity evaluation of rice cultivars using Inter Simple Sequence Repeat marker

Rice, as one of the most important agricultural crops, has a putative potential for ensuring food security and addressing poverty in the world. In the present study, in order to provide basic information to improve rice through breeding programs, the Inter Simple Sequence Repeat (ISSR) marker was used for DNA fingerprinting and finding genetic relationships among 32 different cultivars. In this study...

Independence of color intensity variation in red flesh apples from the number of repeat units in promoter region of the MdMYB10 gene as an allele to MdMYB1 and MdMYBA

MdMYB10 gene expression results in accumulation of anthocyanin in many tissues including flesh of apple fruit. The MdMYB1 and MdMYBA genes are close homologues to MdMYB10 gene and both are responsible for red color phenotype in apple fruit skin. In the current study, an apple genome sequence draft analysis indicated that these three genes are located in a unique contig. Further a...

Journal:
  • Plant physiology

Volume 130, Issue 4

Pages -

Publication date 2002